Dual Correction Strategy for Ranking Distillation in Top-N Recommender System
Knowledge Distillation (KD), which transfers the knowledge of a well-trained
large model (teacher) to a small model (student), has become an important area
of research for practical deployment of recommender systems. Recently, Relaxed
Ranking Distillation (RRD) has shown that distilling the ranking information in
the recommendation list significantly improves the performance. However, the
method still has limitations in that 1) it does not fully utilize the
prediction errors of the student model, which makes the training not fully
efficient, and 2) it only distills the user-side ranking information, which
provides an insufficient view under the sparse implicit feedback. This paper
presents the Dual Correction strategy for Distillation (DCD), which transfers the
ranking information from the teacher model to the student model in a more
efficient manner. Most importantly, DCD uses the discrepancy between the
teacher model and the student model predictions to decide which knowledge to
distill. By doing so, DCD essentially provides learning guidance tailored
to "correcting" what the student model has failed to accurately predict. This
process is applied for transferring the ranking information from the user-side
as well as the item-side to address sparse implicit user feedback. Our
experiments show that the proposed method outperforms the state-of-the-art
baselines, and ablation studies validate the effectiveness of each component.
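The discrepancy idea in this abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not give the DCD formulation, so the rank-gap measure, the function names `rank_positions` and `select_corrections`, and the parameter `k` are all assumptions made for illustration. The sketch compares where the teacher and the student rank each item for one user and picks the items with the largest disagreement in each direction.

```python
def rank_positions(scores):
    """Return the rank position (0 = top) of each item under the given scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, item in enumerate(order):
        ranks[item] = position
    return ranks


def select_corrections(teacher_scores, student_scores, k):
    """Pick up to k items the student ranks too low and up to k it ranks too high.

    Hypothetical selection rule: the gap is student rank minus teacher rank,
    so a positive gap means the student places the item lower than the teacher.
    """
    teacher_ranks = rank_positions(teacher_scores)
    student_ranks = rank_positions(student_scores)
    gap = [s - t for s, t in zip(student_ranks, teacher_ranks)]
    items = range(len(gap))
    underestimated = sorted(
        (i for i in items if gap[i] > 0), key=lambda i: gap[i], reverse=True
    )[:k]
    overestimated = sorted(
        (i for i in items if gap[i] < 0), key=lambda i: gap[i]
    )[:k]
    return underestimated, overestimated


# Toy example: the student swaps the teacher's best and worst items.
teacher = [0.9, 0.8, 0.1, 0.2]
student = [0.1, 0.8, 0.9, 0.2]
under, over = select_corrections(teacher, student, k=1)
```

In this toy case item 0 is the one the student underestimates and item 2 the one it overestimates. Per the abstract, the same kind of comparison would also be made from the item side, ranking users for each item.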
Understanding Grip Shifts: How Form Factors Impact Hand Movements on Mobile Phones
In this paper, we present an investigation into how hand usage is affected by different mobile phone form factors. Our initial (qualitative) study explored how users interact with various mobile phone types (touchscreen, physical keyboard and stylus). The analysis of the videos revealed that each type of mobile phone affords specific handgrips and that the user shifts these grips, and consequently the tilt and rotation of the phone, depending on the context of interaction. To further investigate the tilt and rotation effects, we conducted a controlled quantitative study in which we varied the size of the phone and the type of grip (Symmetric bimanual, Asymmetric bimanual with finger, Asymmetric bimanual with thumb and Single handed) to better understand how they affect tilt and rotation during a dual pointing task. The results showed that the size of the phone does have an effect and that the distance needed to reach action items affects the phone's tilt and rotation. Additionally, we found that the amount of tilt, rotation and reach required corresponded with the participant's grip preference. We finish the paper by discussing the design lessons for mobile UI and proposing design guidelines and applications for these insights.
Exploration in Gradient-Based Reinforcement Learning
Gradient-based policy search is an alternative to value-function-based methods for reinforcement learning in non-Markovian domains. One apparent drawback of policy search is its requirement that all actions be 'on-policy'; that is, that there be no explicit exploration. In this paper, we provide a method for using importance sampling to allow any well-behaved directed exploration policy during learning. We show both theoretically and experimentally that using this method can achieve dramatic performance improvements.
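As a rough illustration of the importance-sampling idea mentioned in this abstract, the sketch below estimates the expected reward of a target policy from actions drawn by a different exploration policy. It is a one-step toy setting, not the paper's algorithm: the function name `importance_sampled_return`, the two-action problem, and all probabilities and rewards are assumptions chosen for the example.

```python
import random


def importance_sampled_return(target_probs, behavior_probs, rewards,
                              n_samples=20000, seed=0):
    """Estimate the target policy's expected reward from behavior-policy samples.

    Each sampled reward is reweighted by target_probs[a] / behavior_probs[a],
    which corrects for the action having been chosen by the exploration
    (behavior) policy rather than by the target policy.
    """
    rng = random.Random(seed)
    actions = list(range(len(rewards)))
    total = 0.0
    for _ in range(n_samples):
        # Act with the exploration policy, not the policy being evaluated.
        a = rng.choices(actions, weights=behavior_probs)[0]
        weight = target_probs[a] / behavior_probs[a]
        total += weight * rewards[a]
    return total / n_samples


# Hypothetical two-action problem: the target policy prefers action 0,
# while the exploration policy picks both actions uniformly.
estimate = importance_sampled_return(
    target_probs=[0.9, 0.1],
    behavior_probs=[0.5, 0.5],
    rewards=[1.0, 0.0],
)
```

The true expected reward of the target policy here is 0.9 × 1.0 + 0.1 × 0.0 = 0.9, and the estimate should approach that value as the number of samples grows. The reweighting requires the exploration policy to give nonzero probability to every action the target policy might take.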
Understanding face and eye visibility in front-facing cameras of smartphones used in the wild
Commodity mobile devices are now equipped with high-resolution front-facing cameras, allowing applications in biometrics (e.g., FaceID in the iPhone X), facial expression analysis, or gaze interaction. However, it is unknown how often users hold devices in a way that allows capturing their face or eyes, and how this impacts detection accuracy. We collected 25,726 in-the-wild photos, taken from the front-facing camera of smartphones, as well as associated application usage logs. We found that the full face is visible about 29% of the time, and that in most cases the face is only partially visible. Furthermore, we identified an influence of users' current activity; for example, when watching videos, the eyes but not the entire face are visible 75% of the time in our dataset. We found that a state-of-the-art face detection algorithm performs poorly against photos taken from front-facing cameras. We discuss how these findings impact mobile applications that leverage face and eye detection, and derive practical implications to address the limitations of the state of the art.
Adapting Text-based Dialogue State Tracker for Spoken Dialogues
Although there have been remarkable advances in dialogue systems through the
dialogue systems technology competition (DSTC), building a robust
task-oriented dialogue system with a speech interface remains one of the key
challenges. Most of the progress has been made for text-based dialogue systems
since there are abundant datasets with written corpora while those with spoken
dialogues are very scarce. However, as can be seen from voice assistant systems
such as Siri and Alexa, it is of practical importance to transfer the success
to spoken dialogues. In this paper, we describe our engineering effort in
building a highly successful model that participated in the speech-aware
dialogue systems technology challenge track in DSTC11. Our model consists of
three major modules: (1) automatic speech recognition error correction to
bridge the gap between the spoken and the text utterances, (2) text-based
dialogue system (D3ST) for estimating the slots and values using slot
descriptions, and (3) post-processing for correcting errors in the estimated
slot values. Our experiments show that it is important to use an explicit
automatic speech recognition error correction module, post-processing, and data
augmentation to adapt a text-based dialogue state tracker for spoken dialogue
corpora.
Comment: 8 pages, 5 figures, Accepted at the DSTC 11 Workshop to be located at
SIGDIAL 202